-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
BUG: Fix bug with symmetric difference of two equal MultiIndexes GH12490 #13504
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BUG: Fix bug with symmetric difference of two equal MultiIndexes GH12490 #13504
Conversation
Current coverage is 84.34%@@ master #13504 diff @@
==========================================
Files 138 138
Lines 51107 51122 +15
Methods 0 0
Messages 0 0
Branches 0 0
==========================================
+ Hits 43103 43117 +14
- Misses 8004 8005 +1
Partials 0 0
|
result_name = result_name_update | ||
|
||
if self.equals(other): | ||
return MultiIndex(levels=[[]] * self.nlevels, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make a constructor that creates an empty MultiIndex._create_as_empty()
and just call here for both of these
@@ -527,3 +527,5 @@ Bug Fixes | |||
|
|||
|
|||
- Bug in ``Categorical.remove_unused_categories()`` changes ``.codes`` dtype to platform int (:issue:`13261`) | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
put your whatsnew higher up (in an empty space in Bug Fixes) this way you dont have conflicts
If Travis passes again I think I got all your comments fixed
|
return MultiIndex(levels=[[]] * nlevels, | ||
labels=[[]] * nlevels, | ||
names=names, verify_integrity=verify_integrity) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note that this is internal. add a Parameters section
return self._create_as_empty(names=result_name) | ||
|
||
difference = sorted(set((self.difference(other)). | ||
union(other.difference(self)))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a comment:
This sorting breaks with mixed-int values (same issue as in #13432). I suppose this line may be safely replaced with the code from PR #13514, slightly changed, namely:
this = self._get_unique_index()
other = other._get_unique_index()
indexer = this.get_indexer(other)
# {this} minus {other}
common_indexer = indexer.take((indexer != -1).nonzero()[0])
left_indexer = np.setdiff1d(np.arange(this.size), common_indexer,
assume_unique=True)
left_diff = this.values.take(left_indexer)
# {other} minus {this}
right_indexer = (indexer == -1).nonzero()[0]
right_diff = other.values.take(right_indexer)
the_diff = _concat._concat_compat([left_diff, right_diff])
Then if the_diff
is empty, an empty MI should be returned, otherwise:
return self._shallow_copy(the_diff, names=result_name).sortlevel()
(In comparison to Index.symmetric_difference
sorting is postponed until the end.)
Similar change may be applied to MultiIndex.difference
, I guess.
can you rebase / update? |
closing, but pls reopen / comment if you can continue |
git diff upstream/master | flake8 --diff
Fixes a bug where the symmetric difference of two equal MultiIndexes would raise a TypeError. MultiIndex used to use the
Index.symmetric_difference
. With this PR it it's own implementation that is more likeMultiIndex.difference
.